INTRODUCTION

The area in which probit analysis is generally utilized is that of
biological assay, where biological assay should be understood to mean the
determination of the potency of a stimulus, whether physical, chemical, biological,
psychological or physiological, by means of the reactions which it produces in
living matter. Biological assay is most commonly considered as referring to
the assessment of the potency of vitamins, hormones, toxicants, and drugs of
all types by means of the responses produced when doses of these preparations
are given to suitable experimental animals.

One type of assay which has been found valuable in many different fields,
but especially in toxicological studies, is dependent on a quantal response.
Though quantitative measurement of response is always to be preferred when
available, there are certain responses which permit no graduation and which
can only be expressed as "occurring" or "not occurring". The most obvious
example of this is death, although workers with insects have often found
difficulty in deciding precisely when an insect is dead. In many
investigations, the only practical interest lies in whether or not a test insect
is dead or perhaps in whether or not it has reached a degree of inactivity
such as is thought to be followed by early death.

One feature possessed by all biological assays is the variability in the
reaction of the test subjects and the consequent impossibility of reproducing,
at will, the same results in successive trials, however carefully the
experimental conditions are controlled.

The statistical treatment of quantal-response data has been much aided
by the development of probit analysis. This method has been widely adopted
as the standard method of reducing the data to simple terms.

Unfortunately, the procedures involved in the maximum likelihood method
of estimating the normal equations are not given in detail in any of the
standard reference texts. Therefore, the purpose of this report is to
outline these procedures and to show their derivations.

APPLICATION  TO  A  GENERAL  PROBLEM 

The  need  for  probit  analysis  arises  from  a  general  toxicology  experiment 
where  various  concentrations  of  the  chemical  are  prepared  and  a  batch  of 
insects  is  assigned  at  random  to  each  concentration  level.  The  chemical  is 
applied  and,  for  each  batch,  a  count  is  made  of  the  total  number  of  insects 
(n)  and  the  number  killed  (r).  The  ratio  of  the  number  killed  to  the  total 
number gives the sample death proportion, $p = r/n$.

For any one subject, there is a level of intensity of the stimulus below
which a response does not occur and above which it does occur. This level is
referred to as the tolerance (or threshold) for that subject. Let this
tolerance (or threshold) of a subject be represented by $\lambda$; then, in a
population of subjects, the main concern is with the distribution of $\lambda$. This
distribution of tolerances may be expressed by

\[ d\pi = f(\lambda)\, d\lambda; \tag{1} \]

this equation states that a proportion, $d\pi$, of the whole population consists
of individuals whose tolerances lie between $\lambda$ and $\lambda + d\lambda$, where $d\lambda$
represents a small interval on the dose scale, and that $d\pi$ is the length of this
interval multiplied by the appropriate value of the distribution function,
$f(\lambda)$.

If a dose $\lambda_0$ is given to the whole population, all individuals will
respond whose tolerances are less than $\lambda_0$, and the proportion of these is
$\pi$, where

\[ \pi = \int_0^{\lambda_0} f(\lambda)\, d\lambda; \tag{2} \]

the measure of dose is here assumed to be a quantity which can conceivably
range from zero to $+\infty$, response being certain for very high doses, so that

\[ \int_0^{\infty} f(\lambda)\, d\lambda = 1. \tag{3} \]


The  distribution  of  tolerances,  as  measured  on  the  natural  scale,  may 
be  markedly  skew,  but  it  is  often  possible,  by  a  simple  transformation  of 
the  scale  of  measurement,  to  obtain  a  distribution  which  is  approximately 
normal.  Although  the  distribution  of  tolerance  concentration  of  a  toxic 
agent  is  usually  far  from  symmetrical,  on  account  of  a  few  individuals  with 
high tolerances providing an extended "tail" to the distribution,
normalization can often be effected by expressing the tolerances in terms of the
logarithms of the concentrations, instead of the absolute values; this
transformation is now accepted as standard practice for expressing the results
of such trials. The use of the log concentration for measuring the dosage
in quantal trials requires no more justification than that it introduces a
simplification into the analysis (Finney, 1952a).

It is convenient to take $x$ as representing the intensity of the
stimulus on the scale on which the tolerances are normally distributed, and
$\lambda$ as the untransformed value of concentration. Thus, for much quantal
response work,

\[ x = \log_{10} \lambda; \tag{4} \]

$x$ will be referred to as the dosage and $\lambda$ will be referred to as the dose.


In an investigation for which tolerance can be satisfactorily defined,
so that, for any given dose, all individuals with equal or lower tolerance
values will respond, a graph of the sample proportion responding against
the dose will give a steadily rising curve. The rate of increase in response
per unit increase in dose is frequently very low with minimal and maximal
dosage, but higher with intermediate values, so that the curve is sigmoidal.
When the stimulus is measured in dosage units, the curve takes the
characteristic normal sigmoid form. This curve does not attain the 0% or 100%
response except at infinitely low or infinitely high dosage, a situation
that does not truly arise (except that, when the measure of dosage intensity
is logarithmic, an infinitely low value represents zero dose).

Assuming that tolerance measures in a population are normally distributed
on the dosage scale, the relationships between the normal curve and
the probit transformation can be seen from Figure 1.

Figure 1. Relationship Between the N.E.D. and the Probit Transformation

The dosage deviations from the mean, $(x - \mu)$, are replaced on the base line by

\[ \frac{x - \mu}{\sigma}, \]

or what is called the normal equivalent deviate (N.E.D.).


This transformation to normal equivalent deviates was first made by Fechner
(1860), but was not considered seriously until it was made again by Gaddum
(1933). Bliss (1934) suggested adding a constant (five) to the normal
equivalent deviate to remove negative numbers and also suggested the name
of the transformation, probit. Therefore, the probit of the proportion $\pi$
is defined as the abscissa which corresponds to a probability $\pi$ in a
normal distribution with mean 5 and variance unity; in symbols, the probit
of $\pi$ is $y$ where

\[ \pi = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y-5} e^{-u^2/2}\, du, \tag{5} \]

where the $\pi$ under the radical is the constant 3.14159.
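As a rough illustration of definition (5), the probit can be evaluated numerically instead of being read from tables. The sketch below assumes Python's standard `statistics.NormalDist`; the function names `probit` and `proportion_from_probit` are illustrative and do not come from the report.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def probit(proportion):
    """Probit of a proportion in (0, 1): normal equivalent deviate plus 5."""
    return 5.0 + STANDARD_NORMAL.inv_cdf(proportion)

def proportion_from_probit(y):
    """Area of an N(5, 1) curve to the left of the probit y."""
    return STANDARD_NORMAL.cdf(y - 5.0)

# A 50% response corresponds to a probit of 5.
for proportion in (0.10, 0.50, 0.90):
    print(proportion, round(probit(proportion), 3))
```

Proportions of exactly 0 or 1 have no finite probit, and `inv_cdf` raises an error for them, which mirrors the remark that the sigmoid curve never attains 0% or 100%.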

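If

\[ d\pi = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx \]

represents the element of probability from the distribution of tolerances on the
$x$ scale of dosages, the expected proportion of insects killed by a dosage $x_0$ is

\[ \pi = \int_{-\infty}^{x_0} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} dx. \tag{6} \]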

Comparison of the two preceding formulas for $\pi$ shows that the probit of the
expected proportion killed is related to the dosage by the linear equation

\[ y = 5 + \frac{1}{\sigma}(x - \mu), \tag{7} \]

which says that the probit equals the N.E.D. + 5. A probit of 5 implies a
mortality of 0.5 (Finney, 1952a).
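The linear relation (7) can be checked numerically. The following sketch assumes a hypothetical tolerance distribution with mean 1.0 and standard deviation 0.5 on the log-dose scale (values chosen only for illustration) and compares the probit of the expected kill from (6) with the straight line.

```python
from statistics import NormalDist

def expected_probit(x, mu, sigma):
    """Probit of the expected kill at dosage x for N(mu, sigma^2) tolerances."""
    pi = NormalDist(mu, sigma).cdf(x)        # expected proportion, equation (6)
    return 5.0 + NormalDist().inv_cdf(pi)    # probit of that proportion, equation (5)

mu, sigma = 1.0, 0.5   # hypothetical values, not taken from the report
for x in (0.5, 1.0, 1.5):
    # Second and third columns should agree: probit versus 5 + (x - mu) / sigma.
    print(x, round(expected_probit(x, mu, sigma), 4), 5.0 + (x - mu) / sigma)
```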


Since a straight line graph is expected when probits are plotted against
dosage, the methods of linear regression are suggested. The measures that
need to be extracted from the linear regression are those of potency and
sensitivity of dosage. The potency generally regarded as best for making
comparisons among drugs is that giving a 50% kill, and is referred to as the
LD 50 (median lethal dose). In experiments in which death is not the response, the
ED 50 (median effective dose) is used.

The  sensitivity  is  the  range  of  dosage  required  for  a  given  range  of 
percentage  kill.  If  a  small  change  in  concentration  gives  a  wide  range  in 
the  percentage  kill,  the  sensitivity  is  high.  The  slope  of  the  regression 
line,  b,  is  associated  with  the  sensitivity.  The  greater  the  slope,  the 
narrower  the  range  in  dosage  for  a  given  range  in  the  percentage  kill. 
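Both measures can be read off a fitted line $Y = a + bx$. The sketch below uses the relation $Y = 3 + 2 \log x$ from the Appendix as its example; the helper names `log_ld50` and `dosage_range` are assumptions made for illustration.

```python
from statistics import NormalDist

def log_ld50(a, b):
    """Dosage at which the line Y = a + b*x reaches a probit of 5."""
    return (5.0 - a) / b

def dosage_range(b, low_kill, high_kill):
    """Width of the dosage interval between two kill proportions."""
    ned = NormalDist().inv_cdf
    return (ned(high_kill) - ned(low_kill)) / b

a, b = 3.0, 2.0
m = log_ld50(a, b)
print(m, 10 ** m)   # log LD 50 and the LD 50 on the dose scale

# Doubling the slope halves the dosage range needed to go from 10% to 90% kill.
print(dosage_range(b, 0.10, 0.90), dosage_range(2 * b, 0.10, 0.90))
```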

THE  METHOD  OF  MAXIMUM  LIKELIHOOD 
The  Basic  Assumptions 

The model used is

\[ y_j = \mu + \beta(x_j - \bar{x}) + \varepsilon_j, \qquad j = 1, 2, \ldots, k, \tag{8} \]

where $k$ is the number of doses.

$\eta_j$ will be defined as

\[ \eta_j = \mu + \beta(x_j - \bar{x}). \tag{9} \]

It is assumed that $E[\varepsilon_j] = 0$. $\eta_j$ is the expected probit at dosage
$x_j$, that is, the point on the base line of the normal curve which gives a
response $\pi_j$.

$p_j$ will be defined as

\[ p_j = \frac{r_j}{n_j}, \tag{10} \]

where $p_j$ is the sample death proportion at dosage $x_j$, $r_j$ is the number
affected and $n_j$ is the number in the trial.

Now let a tolerance distribution be represented by (1), so that the
probability of response to a dose $\lambda_0$ is $\pi$ as defined by (6). If a
batch of $n$ subjects is exposed to the stimulus at a dose $\lambda_0$, and if the
subjects react independently of one another, the probability of $r$ responses
is given by the binomial distribution as

\[ P(r) = \binom{n}{r} \pi^{r} (1 - \pi)^{n - r}. \tag{11} \]
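The binomial probability (11) is simple to evaluate directly. A minimal sketch, using `math.comb` from the Python standard library; the batch size and response probability shown are hypothetical.

```python
from math import comb

def binomial_probability(r, n, pi):
    """Probability of exactly r responses among n independent subjects."""
    return comb(n, r) * pi ** r * (1.0 - pi) ** (n - r)

# Hypothetical batch: 20 insects, each with response probability 0.3.
n, pi = 20, 0.3
probabilities = [binomial_probability(r, n, pi) for r in range(n + 1)]
print(round(sum(probabilities), 12))   # the probabilities over all r sum to one
```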


Assume that a series of $k$ doses is tested in an experiment; then the
probability of a particular number killed in each group is proportional to
$e^{L}$, where

\[ L = \sum_{j=1}^{k} r_j \log \pi_j + \sum_{j=1}^{k} (n_j - r_j) \log(1 - \pi_j). \tag{12} \]

The quantity $e^{L}$, or more strictly a quantity proportional to it but having
a maximum value of 1, has been called the likelihood of the observations.
Or, if the likelihood function ($\mathcal{L}$) is defined as

\[ \mathcal{L} = \prod_{j=1}^{k} \binom{n_j}{r_j} \pi_j^{r_j} (1 - \pi_j)^{n_j - r_j}, \tag{13} \]

then

\[ \log \mathcal{L} = s = \sum_{j=1}^{k} \log \binom{n_j}{r_j} + \sum_{j=1}^{k} r_j \log \pi_j + \sum_{j=1}^{k} (n_j - r_j) \log(1 - \pi_j). \tag{14} \]
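The logarithm in (14) turns the product of binomial terms into a sum. The sketch below checks this numerically for three hypothetical batches; the data values and the function names are illustrative only.

```python
from math import comb, log, prod

def binomial_probability(r, n, pi):
    """Probability of r responses among n subjects, as in (11)."""
    return comb(n, r) * pi ** r * (1.0 - pi) ** (n - r)

def log_likelihood(ns, rs, pis):
    """Sum of log binomial terms over the k batches, as in (14)."""
    return sum(
        log(comb(n, r)) + r * log(pi) + (n - r) * log(1.0 - pi)
        for n, r, pi in zip(ns, rs, pis)
    )

# Hypothetical experiment with k = 3 doses.
ns, rs, pis = [20, 20, 20], [3, 10, 17], [0.2, 0.5, 0.8]
direct = log(prod(binomial_probability(r, n, pi) for n, r, pi in zip(ns, rs, pis)))
print(log_likelihood(ns, rs, pis), direct)   # the two values should agree
```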

The Estimation of $\mu$ and $\beta$

Now $\pi_j$ and $(1 - \pi_j)$ are functions of the dose which contain certain
parameters, and the next problem is that of estimating the parameters from
the experimental data. The likelihood is a maximum when $s$ is a maximum;
therefore, if $\mu$ and $\beta$ are parameters of the distribution of individual
tolerances, the maximum likelihood estimates of $\mu$ and $\beta$ must satisfy the
equations

\[ \frac{\partial s}{\partial \mu} = \frac{\partial s}{\partial \beta} = 0. \tag{15} \]

A composite derivative must be taken to estimate $\mu$ and $\beta$, since $s$ is
a function of $\pi_j$, $\pi_j$ is a function of $\eta_j$, and $\eta_j$ is determined by these
two parameters.


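\[ \frac{\partial s}{\partial \mu} = \sum_{j=1}^{k} \frac{\partial s}{\partial \pi_j} \cdot \frac{\partial \pi_j}{\partial \eta_j} \cdot \frac{\partial \eta_j}{\partial \mu}, \]

\[ \frac{\partial s}{\partial \pi_j} = \frac{r_j}{\pi_j} - \frac{n_j - r_j}{1 - \pi_j}
   = \frac{(1 - \pi_j) r_j - \pi_j (n_j - r_j)}{\pi_j (1 - \pi_j)}
   = \frac{r_j - n_j \pi_j}{\pi_j (1 - \pi_j)}; \tag{16} \]

since $p_j = r_j / n_j$, $r_j = n_j p_j$; so that

\[ \frac{\partial s}{\partial \pi_j} = \frac{n_j (p_j - \pi_j)}{\pi_j (1 - \pi_j)}, \]

\[ \frac{\partial \pi_j}{\partial \eta_j} = \frac{dF(\eta_j)}{d\eta_j} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\eta_j^2} = Z_j, \quad \text{an ordinate of a } N(0, 1), \]

\[ \frac{\partial \eta_j}{\partial \mu} = \frac{\partial [\mu + \beta(x_j - \bar{x})]}{\partial \mu} = 1. \]

Therefore

\[ \frac{\partial s}{\partial \mu} = \sum_{j=1}^{k} \frac{n_j Z_j (p_j - \pi_j)}{\pi_j (1 - \pi_j)}. \tag{17} \]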
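To find the estimate of $\beta$, a similar derivative will be used.

\[ \frac{\partial s}{\partial \beta} = \sum_{j=1}^{k} \frac{\partial s}{\partial \pi_j} \cdot \frac{\partial \pi_j}{\partial \eta_j} \cdot \frac{\partial \eta_j}{\partial \beta}. \tag{18} \]

Utilizing the results from the previous function, it can be seen that

\[ \frac{\partial s}{\partial \pi_j} = \frac{n_j (p_j - \pi_j)}{\pi_j (1 - \pi_j)}, \qquad \frac{\partial \pi_j}{\partial \eta_j} = Z_j. \]

The solution for $\partial \eta_j / \partial \beta$ is

\[ \frac{\partial \eta_j}{\partial \beta} = \frac{\partial [\mu + \beta(x_j - \bar{x})]}{\partial \beta} = (x_j - \bar{x}). \]

Therefore

\[ \frac{\partial s}{\partial \beta} = \sum_{j=1}^{k} \frac{n_j Z_j (p_j - \pi_j)(x_j - \bar{x})}{\pi_j (1 - \pi_j)}. \tag{19} \]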
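The two derivatives (17) and (19) can be verified against finite differences of the log likelihood. This is a sketch under stated assumptions: $\eta_j$ is taken on the probit scale, so that $\pi_j$ is the standard normal area to the left of $\eta_j - 5$; the data are hypothetical, and $\bar{x}$ is taken as the plain mean of the dosages.

```python
from math import comb, log
from statistics import NormalDist

STD = NormalDist()

def response(mu, beta, x, xbar):
    """Expected proportion pi and normal ordinate Z at dosage x."""
    ned = mu + beta * (x - xbar) - 5.0   # assumption: eta is a probit (N.E.D. + 5)
    return STD.cdf(ned), STD.pdf(ned)

def log_likelihood(mu, beta, xs, ns, rs, xbar):
    total = 0.0
    for x, n, r in zip(xs, ns, rs):
        pi, _ = response(mu, beta, x, xbar)
        total += log(comb(n, r)) + r * log(pi) + (n - r) * log(1.0 - pi)
    return total

def scores(mu, beta, xs, ns, rs, xbar):
    """Derivatives of s with respect to mu and beta, equations (17) and (19)."""
    ds_dmu = ds_dbeta = 0.0
    for x, n, r in zip(xs, ns, rs):
        pi, z = response(mu, beta, x, xbar)
        term = n * z * (r / n - pi) / (pi * (1.0 - pi))
        ds_dmu += term
        ds_dbeta += term * (x - xbar)
    return ds_dmu, ds_dbeta

# Hypothetical data: five log-doses with 50 insects per batch.
xs = [0.6, 0.8, 1.0, 1.2, 1.4]
ns = [50] * 5
rs = [4, 12, 25, 36, 46]
xbar = sum(xs) / len(xs)
print(scores(5.0, 2.0, xs, ns, rs, xbar))
```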

The  Weighting  Coefficient 

The reliability of the probit for an observed percentage kill depends
not only on how many individuals were counted to determine this percentage
but also upon the corresponding probit value of the regression line, or,
in actual practice, upon that of the provisional regression line. It is
customary to consider the reliability of a percentage as proportional to the
number of individuals tested. Thus the justification for weighting by the
number of individuals, rather than by the square root of the number of
individuals, is that the reliability of a measure is inversely proportional
to the square of its standard error (the variance) and not to the standard
error itself. The variance, in turn, is a function not only of the number of
cases, but also of several other important factors which will now be
considered.

The standard error needed is not that for a proportion $\pi_j$, but rather
that for the corresponding inferred dosage or probit, $x_j$, which is equivalent
to the percentile. The formula for the variance of a percentile is given by
Kelley (cited in Bliss, 1935, page 150) as

\[ \frac{\sigma^2 \pi_j (1 - \pi_j)}{Z_j^2 n_j}, \tag{20} \]

where $\sigma$ is the standard deviation, $Z_j$ is the ordinate of the normal curve
and is given in tables of the probability integral, and the other terms have
their previous significance. This will also be the variance for the probit
of a single observed percentage mortality, but since the probit is already in
terms of the standard deviation, $\sigma^2$ is always equal to 1 and the variance
of a probit may be simplified to the form

\[ \frac{\pi_j (1 - \pi_j)}{Z_j^2 n_j}. \tag{21} \]

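In order, therefore, to give each observation a weight proportional to its
true reliability, instead of multiplying it by $n_j$, we multiply by the
reciprocal of the variance as the weight. Hence the weight is

\[ n_j w_j = \frac{n_j Z_j^2}{\pi_j (1 - \pi_j)}, \tag{22} \]

where $n_j$, $Z_j^2$, $\pi_j$ and $1 - \pi_j$ have their previous significance. The term

\[ w_j = \frac{Z_j^2}{\pi_j (1 - \pi_j)} \]

will be called the weighting coefficient (Bliss, 1935).

Therefore

\[ \frac{\partial s}{\partial \mu} = \sum_{j=1}^{k} \frac{n_j w_j (p_j - \pi_j)}{Z_j} \tag{23} \]

and

\[ \frac{\partial s}{\partial \beta} = \sum_{j=1}^{k} \frac{n_j w_j (p_j - \pi_j)(x_j - \bar{x})}{Z_j}. \tag{24} \]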
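The weighting coefficient depends only on the expected probit, which is why it can be tabulated. A minimal sketch of its calculation, assuming the expected probit is expressed as the normal equivalent deviate plus 5; the function name is illustrative.

```python
from statistics import NormalDist

STD = NormalDist()

def weighting_coefficient(expected_probit):
    """w = Z^2 / (pi * (1 - pi)) evaluated at an expected probit."""
    ned = expected_probit - 5.0
    pi = STD.cdf(ned)
    z = STD.pdf(ned)
    return z * z / (pi * (1.0 - pi))

# The coefficient is largest at a probit of 5 and falls off symmetrically.
for y in (3.0, 4.0, 5.0, 6.0, 7.0):
    print(y, round(weighting_coefficient(y), 5))
```

The fall-off away from a probit of 5 is the reason observations near 50% kill carry the most information about the line.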


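The Re-definition of the Normal Equations

Equations (23) and (24) cannot be solved as they are, so a new definition
will be used in an effort to solve them. Let

\[ \frac{\partial s}{\partial \mu} = \phi_1(\mu, \beta) \quad \text{and} \quad \frac{\partial s}{\partial \beta} = \phi_2(\mu, \beta). \tag{25} \]

The original condition

\[ \frac{\partial s}{\partial \mu} = \frac{\partial s}{\partial \beta} = 0 \]

still holds, so that the two new equations must equal zero, or

\[ \phi_1(\mu, \beta) = \phi_2(\mu, \beta) = 0. \tag{26} \]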


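From a Taylor's expansion, any function can be written as

\[ f(x, y) = f(x_0, y_0) + \left.\frac{\partial f}{\partial x}\right|_{x_0, y_0} \Delta x + \left.\frac{\partial f}{\partial y}\right|_{x_0, y_0} \Delta y + h(\Delta). \]

Therefore (if we neglect $h(\Delta)$, which involves higher powers of $\Delta\mu$ and $\Delta\beta$),

\[ \phi_1(\mu, \beta) \approx \phi_1(\mu_0, \beta_0) + \left.\frac{\partial \phi_1}{\partial \mu}\right|_{\mu_0, \beta_0} \Delta\mu + \left.\frac{\partial \phi_1}{\partial \beta}\right|_{\mu_0, \beta_0} \Delta\beta = 0, \]

\[ \phi_2(\mu, \beta) \approx \phi_2(\mu_0, \beta_0) + \left.\frac{\partial \phi_2}{\partial \mu}\right|_{\mu_0, \beta_0} \Delta\mu + \left.\frac{\partial \phi_2}{\partial \beta}\right|_{\mu_0, \beta_0} \Delta\beta = 0. \tag{27} \]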


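Thus, in matrix notation,

\[ \begin{bmatrix} \phi_1(\mu_0, \beta_0) \\ \phi_2(\mu_0, \beta_0) \end{bmatrix}
 + \begin{bmatrix} \partial\phi_1/\partial\mu & \partial\phi_1/\partial\beta \\ \partial\phi_2/\partial\mu & \partial\phi_2/\partial\beta \end{bmatrix}_{\mu_0, \beta_0}
 \begin{bmatrix} \Delta\mu \\ \Delta\beta \end{bmatrix} = 0. \tag{29} \]

For convenience, the matrix of partial derivatives, with its sign changed,
will be designated as

\[ V = -\begin{bmatrix} \partial\phi_1/\partial\mu & \partial\phi_1/\partial\beta \\ \partial\phi_2/\partial\mu & \partial\phi_2/\partial\beta \end{bmatrix}_{\mu_0, \beta_0}. \tag{30} \]

But

\[ \begin{bmatrix} \Delta\mu \\ \Delta\beta \end{bmatrix} = \begin{bmatrix} \mu - \mu_0 \\ \beta - \beta_0 \end{bmatrix}. \tag{31} \]

A good guess of $\mu$ is $\mu_1$ and of $\beta$ is $\beta_1$. So consider

\[ \begin{bmatrix} \Delta\mu \\ \Delta\beta \end{bmatrix} = \begin{bmatrix} \mu_1 - \mu_0 \\ \beta_1 - \beta_0 \end{bmatrix}. \]

Then

\[ \begin{bmatrix} \mu_1 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \mu_0 \\ \beta_0 \end{bmatrix} + V^{-1} \begin{bmatrix} \phi_1(\mu_0, \beta_0) \\ \phi_2(\mu_0, \beta_0) \end{bmatrix}. \tag{33} \]

The Evaluation of the V Matrix

The next problem is the evaluation of the V matrix. Since
$\partial s/\partial\mu = \phi_1(\mu, \beta)$ and $\partial s/\partial\beta = \phi_2(\mu, \beta)$,

\[ V = -\begin{bmatrix} \partial^2 s/\partial\mu^2 & \partial^2 s/\partial\mu\,\partial\beta \\ \partial^2 s/\partial\beta\,\partial\mu & \partial^2 s/\partial\beta^2 \end{bmatrix}. \tag{34} \]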


Composite derivatives must be used to evaluate each of the elements in this
matrix. For the first element,

\[ \frac{\partial \phi_1}{\partial \mu} = \sum_{j=1}^{k} \frac{\partial \phi_1}{\partial \pi_j} \cdot \frac{\partial \pi_j}{\partial \eta_j} \cdot \frac{\partial \eta_j}{\partial \mu}. \]

From the fact that

\[ \frac{\partial s}{\partial \mu} = \phi_1(\mu, \beta) = \sum_{j=1}^{k} \frac{n_j Z_j (p_j - \pi_j)}{\pi_j (1 - \pi_j)}, \]

it follows that

\[ \frac{\partial \phi_1}{\partial \pi_j} = -\frac{n_j Z_j}{\pi_j (1 - \pi_j)} + n_j (p_j - \pi_j) \frac{\partial}{\partial \pi_j}\left[ \frac{Z_j}{\pi_j (1 - \pi_j)} \right]. \]

The right hand term involves $(p_j - \pi_j)$ and is assumed, for our purposes,
to be very small. Therefore

\[ \frac{\partial \phi_1}{\partial \pi_j} = -\frac{n_j Z_j}{\pi_j (1 - \pi_j)}. \]

From previous work, $\partial \pi_j / \partial \eta_j = Z_j$ and $\partial \eta_j / \partial \mu = 1$.
The entire composite derivative is therefore

\[ \frac{\partial^2 s}{\partial \mu^2} = \frac{\partial \phi_1}{\partial \mu} = -\sum_{j=1}^{k} \frac{n_j Z_j^2}{\pi_j (1 - \pi_j)}, \tag{37} \]

the term involving $(p_j - \pi_j)$ again being assumed to be very small.
Therefore,

\[ -\frac{\partial \phi_1}{\partial \mu} = \sum_{j=1}^{k} n_j w_j. \tag{38} \]

This is the (1, 1) element of the V matrix.


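The next element to be found is

\[ \frac{\partial \phi_1}{\partial \beta} = \sum_{j=1}^{k} \frac{\partial \phi_1}{\partial \pi_j} \cdot \frac{\partial \pi_j}{\partial \eta_j} \cdot \frac{\partial \eta_j}{\partial \beta}. \tag{39} \]

Each of the terms of this composite derivative has been found in previous
work. Therefore

\[ -\frac{\partial \phi_1}{\partial \beta} = \sum_{j=1}^{k} \frac{n_j Z_j^2 (x_j - \bar{x})}{\pi_j (1 - \pi_j)} = \sum_{j=1}^{k} n_j w_j (x_j - \bar{x}). \]

This function equals zero since $\sum_{j=1}^{k} n_j w_j (x_j - \bar{x}) = 0$ when $\bar{x}$ is the
weighted mean dosage, thereby reducing to 0 the off-diagonal elements
((1, 2) and (2, 1)) of the V matrix.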

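The last element of the V matrix is given by $\partial \phi_2 / \partial \beta$.

From previous work, applying the weights (22),

\[ \phi_2(\mu, \beta) = \sum_{j=1}^{k} \frac{n_j w_j (p_j - \pi_j)(x_j - \bar{x})}{Z_j}. \]

Again from previous work, $\partial \pi_j / \partial \eta_j = Z_j$ and
$\partial \eta_j / \partial \beta = (x_j - \bar{x})$, so that

\[ \frac{\partial \phi_2}{\partial \beta} = -\sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2 + \sum_{j=1}^{k} n_j (p_j - \pi_j)(x_j - \bar{x})^2 \frac{\partial}{\partial \eta_j}\left[ \frac{Z_j}{\pi_j (1 - \pi_j)} \right]. \]

The right hand term involves $(p_j - \pi_j)$ and is assumed to be very small.
Therefore,

\[ -\frac{\partial \phi_2}{\partial \beta} = \sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2. \tag{42} \]

This is the (2, 2) term in the V matrix and completes the computations
for the V matrix.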

The V matrix now becomes

\[ V = \begin{bmatrix} \sum_{j=1}^{k} n_j w_j & 0 \\ 0 & \sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2 \end{bmatrix} \]

and

\[ V^{-1} = \begin{bmatrix} \dfrac{1}{\sum_{j=1}^{k} n_j w_j} & 0 \\ 0 & \dfrac{1}{\sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2} \end{bmatrix}. \tag{43} \]

Finding Values for $\mu_1$ and $\beta_1$


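Now substituting the proper values into (33) gives

\[ \begin{bmatrix} \mu_1 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \mu_0 \\ \beta_0 \end{bmatrix}
 + \begin{bmatrix} \dfrac{1}{\sum n_j w_j} & 0 \\ 0 & \dfrac{1}{\sum n_j w_j (x_j - \bar{x})^2} \end{bmatrix}
 \begin{bmatrix} \sum n_j w_j (p_j - \pi_j) / Z_j \\ \sum n_j w_j (x_j - \bar{x})(p_j - \pi_j) / Z_j \end{bmatrix}. \tag{45} \]

From (45), values of $\mu_1$ and $\beta_1$ are given by

\[ \mu_1 = \mu_0 + \frac{\sum n_j w_j (p_j - \pi_j) / Z_j}{\sum n_j w_j}, \qquad
   \beta_1 = \beta_0 + \frac{\sum n_j w_j (x_j - \bar{x})(p_j - \pi_j) / Z_j}{\sum n_j w_j (x_j - \bar{x})^2}. \]

Estimates for the values of $\mu_0$ and $\beta_0$ can be found from the sight line.
These give the provisional probits

\[ \eta_j^0 = \mu_0 + \beta_0 (x_j - \bar{x}). \]

Then an estimate for $\mu_1$ would be given by

\[ \mu_1 = \frac{\sum n_j w_j y_j}{\sum n_j w_j}, \]

where $y_j = \eta_j^0 + (p_j - \pi_j^0)/Z_j^0$ is called the working probit for $\mu_1$.
Similarly an estimate for $\beta_1$ would be given by

\[ \beta_1 = \frac{\sum n_j w_j (x_j - \bar{x})\, y_j}{\sum n_j w_j (x_j - \bar{x})^2}, \]

where $y_j = \eta_j^0 + (p_j - \pi_j^0)/Z_j^0$ is called the working probit for $\beta_1$.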

Maximum  and  Minimum  Working  Probits 

The working probits for $\mu_1$ and $\beta_1$ are given above, but
$\eta_j^0 - \pi_j^0 / Z_j^0$ is tabulated as the minimum working probit and $1/Z_j^0$ as the
range. Therefore, the working probit is equal to the minimum working
probit + $(p_j)$(range):

\[ y_j = \text{w.p.} = \left( \eta_j^0 - \frac{\pi_j^0}{Z_j^0} \right) + \frac{p_j}{Z_j^0}. \tag{51} \]

Or the working probit is equal to the maximum working probit minus
$(1 - p_j)$(range), where the maximum working probit is defined as
$\eta_j^0 + (1 - \pi_j^0)/Z_j^0$:

\[ y_j = \text{working probit} = \left( \eta_j^0 + \frac{1 - \pi_j^0}{Z_j^0} \right) - \frac{1 - p_j}{Z_j^0}. \tag{52} \]
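The three ways of writing the working probit are algebraically identical, which the following sketch confirms numerically. It assumes the expected probit is the normal equivalent deviate plus 5; the function names and the sample values (expected probit 5.4, observed proportion 0.70) are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()

def working_probit_table(eta):
    """Minimum working probit, range and maximum working probit at expected probit eta."""
    pi = STD.cdf(eta - 5.0)
    z = STD.pdf(eta - 5.0)
    return eta - pi / z, 1.0 / z, eta + (1.0 - pi) / z

def working_probit(eta, p):
    """Working probit written directly as eta + (p - pi) / Z."""
    pi = STD.cdf(eta - 5.0)
    z = STD.pdf(eta - 5.0)
    return eta + (p - pi) / z

eta, p = 5.4, 0.70   # hypothetical expected probit and observed proportion
minimum, spread, maximum = working_probit_table(eta)
print(working_probit(eta, p))
print(minimum + p * spread)            # minimum working probit + p * range
print(maximum - (1.0 - p) * spread)    # maximum working probit - (1 - p) * range
```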

The preceding mathematical treatment of probit analysis would be used
by starting with some initial values for $\mu_0$ and $\beta_0$ and substituting them
into equation (45) to obtain values for $\mu_1$ and $\beta_1$. Then $\mu_1$ and $\beta_1$ would
be re-entered into (45) to obtain values for $\mu_2$ and $\beta_2$. This re-iteration
would continue until values for $\mu_n$ and $\beta_n$ are within pre-set tolerance
limits of $\mu_{n-1}$ and $\beta_{n-1}$. Each of the stationary values, such as

\[ \sum_{j=1}^{k} \frac{n_j w_j (x_j - \bar{x})(p_j - \pi_j)}{Z_j}, \]

could be computed and would remain as constants in
the calculations. Usually only one or two iterations are needed to bring the
values very close together.
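The iteration just described can be sketched as a repeated weighted regression of working probits on dosage. The code below is an illustration under stated assumptions, not the report's own program: the line is written as $Y = a + bx$, $\bar{x}$ and $\bar{y}$ are weighted means, the starting values stand in for a line drawn by sight, and the test data are the exact expected proportions for $Y = 3 + 2x$, so the fit should return that line.

```python
from statistics import NormalDist

STD = NormalDist()

def fit_probit_line(xs, ns, ps, a0, b0, tol=1e-10, max_iter=50):
    """Iterate the weighted regression of working probits until a and b settle."""
    a, b = a0, b0
    for _ in range(max_iter):
        rows = []
        for x, n, p in zip(xs, ns, ps):
            eta = a + b * x                      # provisional probit
            pi = STD.cdf(eta - 5.0)
            z = STD.pdf(eta - 5.0)
            w = z * z / (pi * (1.0 - pi))        # weighting coefficient
            y = eta + (p - pi) / z               # working probit
            rows.append((n * w, x, y))
        snw = sum(nw for nw, _, _ in rows)
        xbar = sum(nw * x for nw, x, _ in rows) / snw
        ybar = sum(nw * y for nw, _, y in rows) / snw
        sxx = sum(nw * (x - xbar) ** 2 for nw, x, _ in rows)
        sxy = sum(nw * (x - xbar) * (y - ybar) for nw, x, y in rows)
        b_new = sxy / sxx
        a_new = ybar - b_new * xbar
        done = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if done:
            break
    return a, b

# Dosages at which Y = 3 + 2x gives 10%, 20%, ..., 90% kill, with exact proportions.
ps = [q / 10.0 for q in range(1, 10)]
xs = [(5.0 + STD.inv_cdf(p) - 3.0) / 2.0 for p in ps]
ns = [100] * len(ps)
a_hat, b_hat = fit_probit_line(xs, ns, ps, a0=2.5, b0=2.4)
print(a_hat, b_hat)
```

With real counts the observed proportions scatter about the line, and a batch with an expected proportion extremely close to 0 or 1 would need guarding against division by a vanishing ordinate.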

Heterogeneity 

If the reactions of the individuals in a batch are not independent of
one another, the weights $nw$, though still proportional to the true weights,
will be too large, and the estimated variances will therefore be too small.
This will be indicated by a large value of a statistic, $\chi^2$, which will be
seen to be a weighted sum of squares of the discrepancies between the expected
and observed number killed. Since the expected value of $\chi^2$ is its number of
degrees of freedom, a significantly large value indicates that all weights
have been overestimated by a factor $\chi^2/(k - 2)$, where $k$ is the number of
dosages tested. All variances should therefore be multiplied by this
heterogeneity factor as compensation for the overweighting.

The $\chi^2$ test for the heterogeneity of the discrepancies between
observed and expected numbers is only valid when the expected numbers are not
small (small usually meaning less than five). At the more extreme dosages tested either
$\pi$ or $1 - \pi$ is often nearly zero, so that, with the usual number of insects
exposed to the stimulus, either the expected number killed, $n\pi$, or the
expected number surviving, $n(1 - \pi)$, is too small for a $\chi^2$ calculated in the
usual manner.

Using the results from the maximum-likelihood method, an
equation given by Finney (1952b) may be used to compute $\chi^2$.
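One common way to obtain such a statistic from the final iteration is as the weighted residual sum of squares of the working probits about the fitted line. The sketch below assumes that form, $S_{yy} - S_{xy}^2 / S_{xx}$ with weights $nw$; it is offered as an illustration and is not a transcription of the report's equation. The numbers are hypothetical.

```python
def heterogeneity_chi2(nws, xs, ys):
    """Weighted residual sum of squares of working probits about the fitted line."""
    snw = sum(nws)
    xbar = sum(nw * x for nw, x in zip(nws, xs)) / snw
    ybar = sum(nw * y for nw, y in zip(nws, ys)) / snw
    sxx = sum(nw * (x - xbar) ** 2 for nw, x in zip(nws, xs))
    sxy = sum(nw * (x - xbar) * (y - ybar) for nw, x, y in zip(nws, xs, ys))
    syy = sum(nw * (y - ybar) ** 2 for nw, y in zip(nws, ys))
    return syy - sxy ** 2 / sxx

def heterogeneity_factor(chi2, k):
    """chi-square divided by its k - 2 degrees of freedom."""
    return chi2 / (k - 2)

# Hypothetical weights nw, dosages and working probits for k = 3 doses.
nws, xs, ys = [10.0, 20.0, 10.0], [0.0, 1.0, 2.0], [3.0, 5.5, 7.0]
chi2 = heterogeneity_chi2(nws, xs, ys)
print(chi2, heterogeneity_factor(chi2, k=len(xs)))
```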
The Variance of the Estimates

(Finney, 1952a, page 54)

Fiducial Limits

If no allowance has been made for heterogeneity, the variance of $Y$ is
given as in (58). But if the heterogeneity factor is significantly greater
than 1, $V(Y)$ must be multiplied by this factor. Therefore, fiducial limits
to $Y$ are

\[ Y \pm s_Y t, \tag{60} \]

where $s_Y$ is the square root of $V(Y)$ and $t$ is the normal deviate for the
level of probability to be used. If there is significant heterogeneity, the
value of Student's $t$ corresponding to this probability should be used.

If the fiducial limits of $Y$ are plotted for each $x$, they will be
found to lie on two curves which are convex to the regression line and which
approach this line most closely at the dosage $\bar{x}$. The further $x$ is removed
from $\bar{x}$ in either direction, the greater is the contribution to the variance
of $Y$ from the second term of (58), which represents the effect of the errors
of estimation of the regression coefficient $b$, and consequently the fiducial
limits are more widely spread. These limits are the values of $x$ for which
the boundaries of the fiducial band attain the selected value of $Y$ (Finney,
1952a).
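The shape of the fiducial band can be illustrated numerically. The sketch assumes the usual weighted-regression form of the variance, $V(Y) = 1/\sum nw + (x - \bar{x})^2 / \sum nw(x - \bar{x})^2$, whose second term grows with distance from $\bar{x}$ as described above; that formula, the function names and the sums used are assumptions made for illustration.

```python
from math import sqrt

def variance_of_y(x, xbar, snw, sxx, heterogeneity=1.0):
    """Assumed variance of the fitted probit at dosage x."""
    return heterogeneity * (1.0 / snw + (x - xbar) ** 2 / sxx)

def fiducial_band(y, x, xbar, snw, sxx, t=1.96, heterogeneity=1.0):
    """Lower and upper limits Y -/+ t * s_Y at dosage x."""
    s_y = sqrt(variance_of_y(x, xbar, snw, sxx, heterogeneity))
    return y - t * s_y, y + t * s_y

# Hypothetical fitted line Y = 3 + 2x with sum(nw) = 100 and Sxx = 50.
xbar, snw, sxx = 1.0, 100.0, 50.0
for x in (0.5, 1.0, 1.5):
    low, high = fiducial_band(3.0 + 2.0 * x, x, xbar, snw, sxx)
    print(x, round(low, 3), round(high, 3))   # the band is narrowest at x = xbar
```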

Exact fiducial limits to $x$, the dosage giving a kill whose probit is
$Y$, are found by solving an equation so as to obtain the value of $x$ for which
$Y$ has a selected fiducial limit. These limits are

\[ x + \frac{g}{1 - g}(x - \bar{x}) \pm \frac{t}{b(1 - g)} \sqrt{ \frac{1 - g}{\sum_{j=1}^{k} n_j w_j} + \frac{(x - \bar{x})^2}{\sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2} }, \tag{61} \]

where

\[ g = \frac{t^2}{b^2 \sum_{j=1}^{k} n_j w_j (x_j - \bar{x})^2}. \]

Significant heterogeneity must be allowed for by
increasing both $g$ and the expression within the square root by the
heterogeneity factor (Finney, 1952a).
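A sketch of the exact limits follows, written for an estimated dosage `m` (for example the log LD 50). The parameter names, the default `t = 1.96` and the sample sums are assumptions for illustration; the heterogeneity factor is applied to both `g` and the expression under the square root, as the text requires.

```python
from math import sqrt

def exact_fiducial_limits(m, xbar, b, snw, sxx, t=1.96, heterogeneity=1.0):
    """Exact fiducial limits to a dosage m estimated from the probit line."""
    g = heterogeneity * t * t / (b * b * sxx)
    if g >= 1.0:
        raise ValueError("slope not significantly different from zero; no limits exist")
    centre = m + g / (1.0 - g) * (m - xbar)
    inside = heterogeneity * ((1.0 - g) / snw + (m - xbar) ** 2 / sxx)
    half_width = t / (b * (1.0 - g)) * sqrt(inside)
    return centre - half_width, centre + half_width

# Hypothetical fit: b = 2, sum(nw) = 100, Sxx = 50, log LD 50 equal to the mean dosage.
lower, upper = exact_fiducial_limits(m=1.0, xbar=1.0, b=2.0, snw=100.0, sxx=50.0)
print(lower, upper)
```

Here `g` is about 0.02, so the limits differ only slightly from the approximate ones, $m \pm t\, s_m$ with $s_m = \sqrt{1/\sum nw}\,/\,b$.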

If $g$ is small (less than about 0.1), the exact equations for confidence
limits simplify to those obtained from the approximate variance. Specifically,
$g$ will be small under all conditions that reduce slope variance. For example,
this will obtain if the (a) error variance is small; (b) slope itself is
steep; (c) dose range is large; (d) desired level of confidence is not
too rigorous; and/or (e) number of observations is large so that $t$ will
be small. Inasmuch as the terms comprising $g$ have to be found anyway for
use in other parts of any data analysis, the actual computation of $g$ will
entail little additional work and should be performed routinely. If it is
found to be smaller than 0.1, it can be dropped; otherwise, it is retained
and the full equations for exact confidence limits must be used. If $g \geq 1$,
then the slope will not differ significantly from zero and no confidence
interval can be found (Goldstein, 1964).
APPENDIX

A Monte Carlo technique was used to generate the data to show the
procedures involved in a probit analysis from a toxicological experiment.
It was theorized that a certain relationship existed. This was that

\[ Y = 3 + 2 \log x, \]

where $Y$ represented the probit of $\pi_j$, 3 was the intercept and 2 was
the slope of the regression line. Log $x$ was found by placing $Y$ equal to
the probit at 10%, 20%, ..., 90%, and solving. $r_j$, the number affected,
is found by using a table of random numbers. As an example, for the 10% case,
100 numbers were scanned and all numbers less than 10 were noted. These
constituted the number killed by the lowest dosage. When this number is divided
by 100, the result will be $p_1$. Then using a different set of 100 random
numbers each time, $p_2, \ldots, p_9$ may be found.
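The data-generation step can be imitated with a pseudo-random generator in place of a printed table of random numbers. This is a sketch only: the seed, the function name and the use of `random.Random.randrange` are assumptions, and the counts it produces will not match those of the original example.

```python
import random
from statistics import NormalDist

def simulate_assay(seed=0, batch_size=100):
    """Generate (log dose, number killed, sample proportion) for 10%, ..., 90% kill."""
    rng = random.Random(seed)
    rows = []
    for percent in range(10, 100, 10):
        probit_value = 5.0 + NormalDist().inv_cdf(percent / 100.0)
        log_dose = (probit_value - 3.0) / 2.0    # solve Y = 3 + 2 log x for log x
        # Scan batch_size two-digit random numbers and count those below the percentage.
        killed = sum(1 for _ in range(batch_size) if rng.randrange(100) < percent)
        rows.append((log_dose, killed, killed / batch_size))
    return rows

for log_dose, killed, proportion in simulate_assay(seed=1):
    print(round(log_dose, 4), killed, proportion)
```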

These values of $p_j$ are then transformed into initial probits by using
Table VI in the Biometrika Tables. This will be the column of the $Y$'s.

The values of log $x$ (X) and initial probits (Y) are plotted against each
other to give a scatter diagram of the points. Log $x$ is on the abscissa
and the initial probits are plotted on the ordinate. A provisional regression
line is then drawn by sight. The line should be close to all of the points,
but where they are widely divergent, the best fit should be sought for the
points between four and six probits.

The points on the provisional regression line intersecting with the log
$x$ values are the values for the conditional probit, $\eta_j^0$. These conditional
probits are then used to find the minimum working probit, the range, and the
weighting coefficient, from Table VI in the Biometrika Tables.

The sum of the minimum working probit and the product of $p_j$ and the
range gives the values of $y_j$. Then values for $nw$, $nwx$, $(nwx)x$,
$nwxy$, $nwy$, and $(nwy)y$ are found for each log $x$.

The first iteration values for $a_1$ and $b_1$, which are the estimators
for $\mu$ and $\beta$, are given by

\[ a = \bar{y} - b\bar{x} \quad \text{and} \quad b = \frac{S_{xy}}{S_{xx}}, \]

where

\[ \bar{x} = \frac{\sum nwx}{\sum nw}, \quad \bar{y} = \frac{\sum nwy}{\sum nw}, \quad
   S_{xx} = \sum nwx^2 - \frac{(\sum nwx)^2}{\sum nw}, \quad
   S_{xy} = \sum nwxy - \frac{(\sum nwx)(\sum nwy)}{\sum nw}. \]

Then the values for $a_1$ and $b_1$ are substituted into $\eta_j^1 = a + bx$
(or $a + b \log x$) to give the formula for the next iteration.

To start the next iteration, $\eta_j^1$ is evaluated and produces that column
in the table; log $x$ and $p_j$ stay the same. The same procedure as before
is followed to give values for $a_2$ and $b_2$ in $\eta_j^2 = a + bx$. $\eta_j^2$ is
evaluated and, if the resultant values are within a pre-set tolerance limit,
iterations are halted and values for $m$ and $b$ are determined. If the tolerance limit
is not met, another iteration must be performed.

The values for the probits in the Biometrika Tables are given to two
places so that the pre-set tolerance limit is met after the second iteration,
where there is no difference between $\eta_j^1$ and $\eta_j^2$.

For this example, $m$, the log of the LD 50, is 1 and the slope of the line,
$b$, is 2.10742.
The area in which probit analysis is generally utilized is that of
biological assay. The type of assay found most valuable in many different
fields, but especially in toxicological studies, is dependent upon a
quantal response; that is, one which can be expressed as "occurring" or
"not occurring". A feature possessed by all biological assays is the
variability in the reactions of the test subjects and the consequent impossibility
of reproducing the same results in successive trials.

The purpose of this report is to show, in greater detail, the procedures
involved in the maximum likelihood method of estimating the normal equations
in a probit analysis.

In a general toxicology experiment, various concentrations of the chemical
are prepared and applied to batches of insects which have been assigned at
random to each concentration level. The total number ($n$) and the number
killed ($r$) are counted and the ratio, $p = r/n$, gives the sample death
proportion.

The tolerance for any one subject, the level below which a response does
not occur and above which it does occur, is represented by $\lambda$. The main
concern, for a population of subjects, is the distribution of $\lambda$, which may be
expressed by $d\pi = f(\lambda)\, d\lambda$.

If a dose $\lambda_0$ is given to the whole population, all individuals will
respond whose tolerances are less than $\lambda_0$, and the proportion of these is
$\pi$, where $\pi = \int_0^{\lambda_0} f(\lambda)\, d\lambda$.

When the distribution of tolerance concentrations is measured on a
natural scale, the curve may be far from symmetrical, due to a few individuals
with high tolerances. In such a case, normalization may be achieved by
expressing the tolerances in terms of the logarithms of the concentrations. Letting
$x$ represent the intensity of the stimulus and $\lambda$ the concentration of the
stimulus, then $x = \log_{10} \lambda$; $x$ and $\lambda$ will be referred to as the dosage
and dose, respectively.

The relationship between the sample death proportion and the dose
yields a monotonic function, and when the stimulus is measured in dosage
units, the curve has the normal sigmoid form.

If the dosage deviations from the mean, in the normal curve, are
replaced by the normal equivalent deviate + 5, the probit transformation is
realized. The probit of the proportion $\pi$ is defined as the abscissa which
corresponds to a probability $\pi$ in a $N(5, 1)$.

The method of maximum likelihood was used to solve for the estimators
for $\mu$ and $\beta$ in the regression equation $y_j = \mu + \beta(x_j - \bar{x}) + \varepsilon_j$.

If a batch of $n$ subjects is exposed to the stimulus at a dose $\lambda_0$, and
if the subjects react independently of one another, the probability of $r$
responses is given by the binomial distribution. The maximum likelihood
estimates of $\mu$ and $\beta$, parameters of the distribution of individual
tolerances, were found by a numerical solution of the normal equations.

The variance of the estimates and the fiducial limits are given for
the maximum likelihood method.

A numerical example, with data generated by a Monte Carlo technique, is worked to
show the procedures used in analyzing relevant data.


